<?xml version="1.0" encoding="ISO-8859-1"?>
<metadatalist>
	<metadata ReferenceType="Conference Proceedings">
		<site>sibgrapi.sid.inpe.br 802</site>
		<holdercode>{ibi 8JMKD3MGPEW34M/46T9EHH}</holdercode>
		<identifier>8JMKD3MGPEW34M/45CKPKE</identifier>
		<repository>sid.inpe.br/sibgrapi/2021/09.04.20.35</repository>
		<lastupdate>2021:09.04.20.35.48 sid.inpe.br/banon/2001/03.30.15.38 administrator</lastupdate>
		<metadatarepository>sid.inpe.br/sibgrapi/2021/09.04.20.35.48</metadatarepository>
		<metadatalastupdate>2022:06.14.00.00.25 sid.inpe.br/banon/2001/03.30.15.38 administrator {D 2021}</metadatalastupdate>
		<doi>10.1109/SIBGRAPI54419.2021.00052</doi>
		<citationkey>SantosSiDaRoDrDu:2021:FoUnAp</citationkey>
		<title>A Form Understanding Approach to Printed and Structured Engineering Documentation</title>
		<format>On-line</format>
		<year>2021</year>
		<numberoffiles>1</numberoffiles>
		<size>34470 KiB</size>
		<author>Santos, Gabriel Lavoura dos,</author>
		<author>Silva, Vanessa Telles da,</author>
		<author>Dalmolin, Laura de Aguiar,</author>
		<author>Rodrigues, Ricardo Nagel,</author>
		<author>Drews Jr, Paulo Lilles Jorge,</author>
		<author>Duarte Filho, Nelson Lopes,</author>
		<affiliation>Universidade Federal do Rio Grande, Brazil</affiliation>
		<affiliation>Universidade Federal do Rio Grande, Brazil</affiliation>
		<affiliation>Universidade Federal do Rio Grande, Brazil</affiliation>
		<affiliation>Universidade Federal do Rio Grande, Brazil</affiliation>
		<affiliation>Universidade Federal do Rio Grande, Brazil</affiliation>
		<affiliation>Universidade Federal do Rio Grande, Brazil</affiliation>
		<editor>Paiva, Afonso ,</editor>
		<editor>Menotti, David ,</editor>
		<editor>Baranoski, Gladimir V. G. ,</editor>
		<editor>Proença, Hugo Pedro ,</editor>
		<editor>Junior, Antonio Lopes Apolinario ,</editor>
		<editor>Papa, João Paulo ,</editor>
		<editor>Pagliosa, Paulo ,</editor>
		<editor>dos Santos, Thiago Oliveira ,</editor>
		<editor>e Sá, Asla Medeiros ,</editor>
		<editor>da Silveira, Thiago Lopes Trugillo ,</editor>
		<editor>Brazil, Emilio Vital ,</editor>
		<editor>Ponti, Moacir A. ,</editor>
		<editor>Fernandes, Leandro A. F. ,</editor>
		<editor>Avila, Sandra,</editor>
		<e-mailaddress>lavourasantos@gmail.com</e-mailaddress>
		<conferencename>Conference on Graphics, Patterns and Images, 34 (SIBGRAPI)</conferencename>
		<conferencelocation>Gramado, RS, Brazil (virtual)</conferencelocation>
		<date>18-22 Oct. 2021</date>
		<publisher>IEEE Computer Society</publisher>
		<publisheraddress>Los Alamitos</publisheraddress>
		<booktitle>Proceedings</booktitle>
		<tertiarytype>Full Paper</tertiarytype>
		<transferableflag>1</transferableflag>
		<versiontype>finaldraft</versiontype>
		<keywords>form understanding, text detection, spatial layout analysis.</keywords>
		<abstract>A significant number of companies still depend on printed documents, such as healthcare reports, engineering specifications, or historical documents. These documents are diverse in layout and content, so each document structure requires a different approach, which makes information extraction a costly and inefficient task. We classify documents into three categories: non-structured, semi-structured, and structured, the last being the focus of the present work. We propose a pattern recognition method for structured documents that uses an anchoring relationship between question-answer objects, together with a system of hypotheses and a probability distribution, to identify which predefined model a document belongs to. The method therefore acts as a system for both identification and content extraction for structured documents. It shows promising results for pattern recognition across all document models, with 78% to 97% of objects extracted correctly.</abstract>
		<language>en</language>
		<targetfile>Sibgrapi_2021 - Paper ID 64.pdf</targetfile>
		<usergroup>lavourasantos@gmail.com</usergroup>
		<visibility>shown</visibility>
		<documentstage>not transferred</documentstage>
		<mirrorrepository>sid.inpe.br/banon/2001/03.30.15.38.24</mirrorrepository>
		<nexthigherunit>8JMKD3MGPEW34M/45PQ3RS</nexthigherunit>
		<nexthigherunit>8JMKD3MGPEW34M/4742MCS</nexthigherunit>
		<citingitemlist>sid.inpe.br/sibgrapi/2021/11.12.11.46 5</citingitemlist>
		<hostcollection>sid.inpe.br/banon/2001/03.30.15.38</hostcollection>
		<agreement>agreement.html .htaccess .htaccess2</agreement>
		<lasthostcollection>sid.inpe.br/banon/2001/03.30.15.38</lasthostcollection>
		<url>http://sibgrapi.sid.inpe.br/rep-/sid.inpe.br/sibgrapi/2021/09.04.20.35</url>
	</metadata>
</metadatalist>